{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "aa0a86859809102",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# Image Embedding\n",
    "As of version 0.3.0 fastembed supports computation of image embeddings.\n",
    "\n",
    "The process is as easy and straightforward as with text embeddings. Let's see how it works."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "cea8fd5c019571fe",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-02T11:35:40.126023Z",
     "start_time": "2024-06-02T11:35:39.864701Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Fetching 3 files: 100%|██████████| 3/3 [00:00<00:00, 47482.69it/s]\n"
     ]
    },
    {
     "data": {
      "text/plain": "[array([0.        , 0.        , 0.        , ..., 0.        , 0.01139933,\n        0.        ], dtype=float32),\n array([0.02169187, 0.        , 0.        , ..., 0.        , 0.00848291,\n        0.        ], dtype=float32)]"
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from fastembed import ImageEmbedding\n",
    "\n",
    "model = ImageEmbedding(\"Qdrant/resnet50-onnx\")\n",
    "\n",
    "embeddings_generator = model.embed(\n",
    "    [\"../../tests/misc/image.jpeg\", \"../../tests/misc/small_image.jpeg\"]\n",
    ")\n",
    "embeddings_list = list(embeddings_generator)\n",
    "embeddings_list"
   ]
  },
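  {
   "cell_type": "markdown",
   "id": "c4e1a7f09b3d526e",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "Each embedding is a one-dimensional `float32` array (2048 values for `Qdrant/resnet50-onnx`, per the supported models list), so two images can be compared with a standard similarity measure. The sketch below is an illustration rather than part of FastEmbed: `cosine_similarity` is a hypothetical helper written for this example, and the short lists are made-up stand-ins for real embeddings.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "\n",
    "def cosine_similarity(a, b):\n",
    "    # Hypothetical helper: cosine similarity of two equal-length vectors.\n",
    "    dot = sum(float(x) * float(y) for x, y in zip(a, b))\n",
    "    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))\n",
    "    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))\n",
    "    if norm_a == 0.0 or norm_b == 0.0:\n",
    "        return 0.0  # convention chosen here for all-zero vectors\n",
    "    return dot / (norm_a * norm_b)\n",
    "\n",
    "\n",
    "# Made-up stand-ins for two embeddings; real vectors are much longer.\n",
    "first = [0.0, 0.5, 0.0, 1.0]\n",
    "second = [0.1, 0.4, 0.0, 0.9]\n",
    "print(cosine_similarity(first, second))\n",
    "```\n",
    "\n",
    "With the cell above already executed, `cosine_similarity(embeddings_list[0], embeddings_list[1])` would apply the same helper to the two image embeddings."
   ]
  },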
  {
   "cell_type": "markdown",
   "id": "3f838f18523ad1e0",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Preprocessing\n",
    "\n",
    "Preprocessing is encapsulated in the ImageEmbedding class, applied operations are identical to the ones provided by [Hugging Face Transformers](https://huggingface.co/docs/transformers/en/index).\n",
    "You don't need to think about batching, opening/closing files, resizing images, etc., Fastembed will take care of it."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "894b33ff9b385d72",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## Supported models\n",
    "\n",
    "List of supported image embedding models can either be found [here](https://qdrant.github.io/fastembed/examples/Supported_Models/#supported-image-embedding-models) or by calling the `ImageEmbedding.list_supported_models()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "6d6a4cbbd2200d14",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-06-02T11:40:19.313226Z",
     "start_time": "2024-06-02T11:40:19.309845Z"
    },
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": "[{'model': 'Qdrant/clip-ViT-B-32-vision',\n  'dim': 512,\n  'description': 'CLIP vision encoder based on ViT-B/32',\n  'size_in_GB': 0.34,\n  'sources': {'hf': 'Qdrant/clip-ViT-B-32-vision'},\n  'model_file': 'model.onnx'},\n {'model': 'Qdrant/resnet50-onnx',\n  'dim': 2048,\n  'description': 'ResNet-50 from `Deep Residual Learning for Image Recognition <https://arxiv.org/abs/1512.03385>`__.',\n  'size_in_GB': 0.1,\n  'sources': {'hf': 'Qdrant/resnet50-onnx'},\n  'model_file': 'model.onnx'}]"
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ImageEmbedding.list_supported_models()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
